1. INTRODUCTION

Image quality evaluation (IQE) enables computers to assess picture quality in a way that matches
human perception. Visual quality can then be quantified and improved for numerous vision tasks. The
availability of a reference image determines the IQE approach: full-reference, reduced-reference, or
no-reference. Despite the extensive research on the correlation between image attributes and perceived
quality, no study has yet examined the potential quality of a restored image. Evaluating an image's features
is a crucial step in image processing. Because applications must work with the best feature point
descriptors, there is an increasing demand for descriptors that are quick to compute and match,
memory-efficient, and accurate [1]. Such descriptors are used in tasks such as wide-baseline stereo matching,
panorama stitching, 3D scene reconstruction, and object, scene, texture, and gesture recognition [2].
Incorporating key features specific to their intended use is critical to the success of mobile applications [3]. In
image processing, different methods and techniques are used to extract features. Every image processing
application extracts the elements of the image to obtain accurate results [4]. The matching
method uses point-to-point alignments to speed up branch-and-bound [5]. A wide variety of applications
make use of these pattern-matching algorithms. The number of feature identification and extraction methods
has grown as image analysis has become more prominent. Scale-invariant feature transform (SIFT) [6], speeded up
robust features (SURF), binary robust invariant scalable keypoints (BRISK), and oriented FAST and rotated BRIEF (ORB) [7]
descriptors detect keypoints and support feature matching, as shown in Figure 1.


Figure 1. Inlier extraction of the real-world images 


Estimating the homography matrix is a primary step of this evaluation; it is used to find the pure (correct)
matches between images, called inliers. The inlier count is the primary measure for evaluating image features.
Inliers are calculated from the homography matrix and the image matches [8]. Matching operates on the image
keypoints and descriptors, which are obtained during feature extraction with the help of detectors.
To obtain the best image keypoints and descriptors, we applied the ORB and boosted efficient binary local image
descriptor (BEBLID) detectors, along with other detectors such as learned arrangements of three patch codes
(LATCH), fast retina keypoint (FREAK), BRISK, and BOOST [9], to different real-world images that
include variations in orientation. From those images, we extract the keypoints and
descriptors and find the matches using K-nearest neighbor (KNN) matching; from those
matches, we calculate the inliers [10].

Calonder et al. [1] proposed binary robust independent elementary features (BRIEF), a very fast local
binary descriptor. They tested it against descriptors such as SIFT, SURF, U-SURF, and U-SIFT and found BRIEF to
be a fast binary descriptor. Kaneva et al. [2] proposed evaluating image features in photorealistic virtual
environments. Designing and comparing feature descriptors requires assessing how their performance responds to
changes in the imaging conditions. They studied the sensitivity of descriptors to illumination, scene, and
viewpoint. This requires ground-truth data from many locations captured under different camera and lighting
conditions, which is difficult to gather in real life. Igler [3] introduced a lightweight process model to help
identify virtual user interface and application logic properties. The iterative, incremental process is based on
engineering principles for software product lines, combined with design science research. Each cycle involves
the development of several prototype versions that the client evaluates. The prototypes employ feature models.

Remmen et al. [4] examined how real-world perturbations affect six traditional feature-based
approaches and two convolutional neural network (CNN) methods over a year, a month, and a year. The
number of recovered inliers and the processing times were used to assess the strategies on low-, medium-, and
high-complexity instances. Babu et al. [5] recognized facial expressions using Bezier curves and 2D image
processing. Rosten et al. [6] noted that, when extra processing is required, efficiency determines whether a
detector can function at frame rate. Their work presents a novel feature detection heuristic and a machine
learning-based feature detector that can fully process live phase alternating line (PAL) video in less than 5% of
the available processing time. They then generalize the detector so that it can be tuned for repeatability and
efficiency. Lepetit and Fua [7] proposed a statistical learning-based solution for 3D object identification and
pose estimation that achieves speed and reliability at runtime. The wide-baseline matching problem that this
covers is recast as a classification problem that can be solved using fast and simple randomized trees. Winder
and Brown [8] studied interest point descriptors for 3D reconstruction and image matching. They examined the
components of descriptor algorithms and evaluated their combinations. SIFT, gradient location-orientation
histogram (GLOH), and spin images are published descriptors that fit their framework. To learn appropriate
parameter choices for each candidate method, they trained on patches from a multi-image 3D reconstruction with
reliable ground-truth matches. Jakubovic and Velagic [9] addressed feature matching and object recognition in
two images using brute-force matching. Their feature identification and descriptor extraction framework used
ORB, BRISK, SIFT, and SURF. Features were matched using the brute-force and KNN methods. The robust
random sample consensus (RANSAC) technique uses the acquired matches to estimate the transformation
between two successive images. Porav et al. [10] observed that computer vision tasks become substantially
more challenging when raindrops are present on the lens because local image distortions appear. They created a
pre-processing filter to reduce the effect of raindrops on lenses. Sun et al. [11] converted daytime images
to nighttime ones to improve semantic segmentation. They showed that performance depends on the percentage of
synthetic nighttime images. The optimal point gives consistently high performance throughout
the day and night.

Mikolajczyk and Schmid [12] evaluated descriptors computed for local interest regions retrieved by the
Harris-Affine detector and compared them with SIFT and distribution-based descriptors. Jing et al. [13] proposed
colored binary robust invariant scalable keypoints (CBRISK). Because of the relevance of colour information in
vision applications, they created CBRISK, a novel keypoint detection and description approach that takes
colour into account. The proposed solution uses keypoints from a photometric invariant colour space
instead of greyscale intensity images [14]. The whole feature point handling pipeline, including detection,
orientation estimation, and feature description, is implemented using a unique deep network architecture.
Li et al. [15] proposed asymmetric generative adversarial networks (GAN) for unpaired image-to-image
translation. They claimed that the asymmetric GAN offers a superior approach for unpaired image-to-image
translation between asymmetric domains.

Hua et al. [16] introduced discriminant embedding for local image descriptors. SIFT and GLOH are very
effective for image matching and visual recognition. However, SIFT descriptors are parameterized in a
128-dimensional space. Because such descriptors are often high-dimensional, Winder et al. [17] investigated
dimension reduction. Strecha et al. [18] studied a simple method for producing a binary string from a SIFT
descriptor. Despite the small descriptor size, this technique matches well. For large-scale keypoint retrieval
applications, their binary descriptor beats SIFT and DAISY in the low false positive range.

Ozuysal et al. [19] showed that a simple, efficient, and robust solution can be obtained by formulating the
problem in a naive Bayesian classification framework. The classifier combines hundreds of simple binary features
and class posterior probabilities to differentiate the patches around keypoints. Assuming independence across
feature sets makes the task computationally tractable.

According to Rosten and Drummond [20], SIFT, Harris, and smallest univalue segment assimilating
nucleus (SUSAN) provide high-quality features, but they are too computationally expensive for real-time
applications. They used machine learning to develop a feature detector that can fully process live PAL video in
less than 7% of the available processing time. SURF was proposed by Bay et al. [21]. They provided a fast,
high-performance, scale- and rotation-invariant interest point detector and descriptor. They asserted that the
Laplacian-based indexing method speeds up the matching step without sacrificing performance. Agrawal et al. [22]
noted that the goal of image matching is to detect similarities between two photographs of the same scene. Object
identification, image indexing, structure from motion, and visual localization are some uses of this core computer
vision topic.

Baumberg [23] presented reliable feature matching across widely separated views. Many
orientations are compared using different techniques. The resulting feature matching procedure is optimized for
structure-from-motion applications, discarding unreliable matches at the price of fewer feature matches.
Lowry et al. [24] surveyed work on visual place recognition. Their article discusses place recognition in
animals, how a "place" is defined in robotics, and the main components of a place recognition system. Long-term
robot operations have shown that changing appearance may cause visual place recognition to fail. Thus, they
discussed how place recognition systems could implicitly or explicitly account for appearance change.
Shankar et al. [25] used a technique that reduced the noise in portable grey map (PGM) images by using
object-oriented fuzzy filters and analyzed the performance using social group optimization (SGO) and
accelerated particle swarm optimization (APSO) [26].

Szeliski [27] noted that one of the most popular consumer uses of image registration and blending is
stitching many photos together to create high-resolution panoramas. The work discusses direct
intensity-based and feature-based registration methods, global and local alignment, and high-accuracy
correspondences between overlapping images using the motion models of panoramic image stitching. It then
covers several compositing approaches, such as multi-band and gradient-domain blending, as well
as how to remove blurring and ghosting. These approaches can build high-resolution panoramas for
static or interactive viewing. Devareddi et al. [28] worked on segmenting images of
scanned documents using neural networks. Alahi et al. [29] proposed "FREAK: fast retina keypoint" in 2012. In
this study, a learning stage is used to select the most significant pairs. The difference of Gaussians fits one
plausible explanation of resource optimization in the human visual system. This research focuses on selecting key
pairings for future high-level applications like object recognition.


2. METHOD 

To evaluate the images, the proposed model pursues four objectives. Figure 2 illustrates the modules of
the workflow. The whole process described in the following sections was used to accomplish the work.


Figure 2. Flow of proposed work (surviving flowchart labels: Draw Matches for inliers; Find Inliers ratio)


2.1. Objectives 

The aim of the research is to evaluate the image quality of real-world images. This aim can be reached
with the following objectives: i) detecting keypoints and descriptors, ii) matching descriptors and finding
inliers, iii) drawing the matches for the inliers, and iv) calculating the inlier ratio.


2.2. Work flow 

Figure 2 shows the entire workflow of the method. After acquiring the images from the
datasets, we extract their features and use them to compute the homography matrix and the matching features.
We then calculate the inliers, draw the matches for the inliers, and finally find the inlier ratio.


2.2.1. Input image and query image 

The system requires an input image and a query image. An appropriate image serves as the input,
while a deviating or similar image serves as the query image. To read the input and query images, we use the
OpenCV library and its image read function, as shown in Figures 3(a) and (b).


Figure 3. Image reading function: (a) input image and (b) query image


2.2.2. Feature extraction 
Feature extraction involves finding the keypoints and descriptors of the input and query images using
feature detectors. For this feature evaluation, we worked with efficient detectors such as ORB, BRIEF, BRISK,
FREAK, BEBLID, and LATCH. Each detector produces different keypoints and descriptors.
— ORB: it combines the FAST keypoint detector with the BRIEF descriptor. ORB takes
the query image and converts it to grayscale. The detected keypoints are given as input to BRIEF to
find the descriptors, which are computed for both images.
Finding keypoints using the FAST detector: the FAST algorithm finds the feature points of the
image. For each candidate point p, the 16 pixels on a circle around it are stored as a vector and used to
calculate the feature vector P. Relative to p, each of these 16 pixels x can be in one of three states:

\[
S_{p \to x} =
\begin{cases}
d, & I_{p \to x} \le I_p - t & \text{(darker)} \\
s, & I_p - t < I_{p \to x} < I_p + t & \text{(similar)} \\
b, & I_p + t \le I_{p \to x} & \text{(brighter)}
\end{cases}
\]

where I_p is the intensity of p, I_{p→x} is the intensity of pixel x on the circle, and t is a threshold.
Depending on these states, the feature vector P is partitioned into three subsets P_d, P_s, and P_b. A Boolean
variable K_p is true if p is an interest point and false otherwise; the entropy of K_p is used to determine the
interest points.
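As an illustration, the three pixel states and the resulting partition can be sketched in plain Python. This is a minimal sketch and not the OpenCV implementation; the function names and the sample intensities are hypothetical, and the ring is passed in as a plain list of intensities.

```python
def classify_ring_pixel(center, ring_value, threshold):
    """Return 'd' (darker), 's' (similar) or 'b' (brighter) for one ring pixel."""
    if ring_value <= center - threshold:
        return "d"
    if ring_value >= center + threshold:
        return "b"
    return "s"


def partition_ring(center, ring_values, threshold):
    """Split the indices of the ring pixels into the subsets P_d, P_s and P_b."""
    subsets = {"d": [], "s": [], "b": []}
    for index, value in enumerate(ring_values):
        subsets[classify_ring_pixel(center, value, threshold)].append(index)
    return subsets


# Hypothetical example: centre intensity 100, threshold t = 20, six ring pixels.
example = partition_ring(100, [70, 95, 130, 80, 120, 119], 20)
```

A real FAST ring has 16 pixels; six are used here only to keep the example short.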

Finding the descriptor with BRIEF from the FAST keypoints:
BRIEF: this descriptor is used to compute the descriptors of the input image. Plain BRIEF takes the
input image and has to be supplemented with scale and rotational invariance, whereas in ORB we use
the FAST keypoints, which are generated with scale and rotational invariance. After
detecting the keypoints, we take patches to build the descriptors. Patches are squares that surround the
keypoints. To build the binary string representing the region around a keypoint, we go through all the sampled
pairs of points and write 1 in the binary string if the first point of the pair has a higher intensity than the
second, and 0 otherwise. The descriptors are created using:

\[
f_{n_d}(p) = \sum_{1 \le i \le n_d} 2^{i-1} \, \tau(p; x_i, y_i)
\]

where \tau(p; x_i, y_i) is the binary intensity test on the i-th pair of points (x_i, y_i) in patch p, and n_d is
the number of tests.

ORB produces the descriptors of the input and query images as n-dimensional arrays.
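The construction of the binary string can be sketched as follows. This is a simplified illustration under stated assumptions: the patch is a small list of lists, the pair locations are chosen by hand rather than sampled, and the names `brief_descriptor` and `hamming_distance` are hypothetical. The bit convention follows the text above (1 when the first point of a pair is brighter).

```python
def brief_descriptor(patch, pairs):
    """Pack the results of the intensity tests into one integer.

    patch: 2D list of grey values; pairs: list of ((r1, c1), (r2, c2)).
    The i-th test (1-based) contributes its bit with weight 2**(i - 1).
    """
    value = 0
    for i, ((r1, c1), (r2, c2)) in enumerate(pairs):
        bit = 1 if patch[r1][c1] > patch[r2][c2] else 0
        value |= bit << i
    return value


def hamming_distance(a, b):
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


# Hypothetical 2x2 patch and three hand-picked test pairs.
patch = [[10, 200], [90, 50]]
pairs = [((0, 1), (0, 0)), ((0, 0), (0, 1)), ((1, 0), (1, 1))]
descriptor = brief_descriptor(patch, pairs)  # bits 1, 0, 1 -> 0b101
```

Binary descriptors of this kind are compared with the Hamming distance, which is why a brute-force Hamming matcher is used later in the workflow.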

— BRISK: BRISK builds its descriptors from long and short sampling pairs. Long pairs are used to
determine the orientation, and short pairs are used for the intensity comparisons that create the descriptor.
BRISK computes the local gradient with the following formula:

\[
g(p_i, p_j) = (p_j - p_i) \cdot \frac{I(p_j, \sigma_j) - I(p_i, \sigma_i)}{\lVert p_j - p_i \rVert^2}
\]

where I(p_i, \sigma_i) is the intensity of sample point p_i smoothed by a Gaussian with the appropriate standard
deviation \sigma_i, and g(p_i, p_j) is the local gradient between the sampling pair. BRISK builds the descriptor
using only short pairs: after rotating the set of short pairs by the calculated orientation \alpha, it compares
the smoothed intensities of the two points in each pair:

\[
b =
\begin{cases}
1, & I(p_j^{\alpha}, \sigma_j) > I(p_i^{\alpha}, \sigma_i) \\
0, & \text{otherwise}
\end{cases}
\]
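The local gradient can be illustrated with a short sketch. The averaging of the long-pair gradients and the use of `atan2` to obtain the orientation follow the usual BRISK formulation and are an assumption here, since the text above does not spell them out; the function names and sample values are hypothetical.

```python
import math


def local_gradient(p_i, p_j, intensity_i, intensity_j):
    """Local gradient g(p_i, p_j) from two points and their smoothed intensities."""
    dx, dy = p_j[0] - p_i[0], p_j[1] - p_i[1]
    scale = (intensity_j - intensity_i) / (dx * dx + dy * dy)
    return (dx * scale, dy * scale)


def orientation_from_long_pairs(long_pairs):
    """Angle (radians) of the mean gradient over the long pairs.

    long_pairs: list of (p_i, p_j, intensity_i, intensity_j) tuples.
    """
    gradients = [local_gradient(*pair) for pair in long_pairs]
    gx = sum(g[0] for g in gradients) / len(gradients)
    gy = sum(g[1] for g in gradients) / len(gradients)
    return math.atan2(gy, gx)


# Hypothetical pair two pixels apart along x, intensity rising from 10 to 30.
g = local_gradient((0, 0), (2, 0), 10, 30)
alpha = orientation_from_long_pairs([((0, 0), (0, 2), 10, 30)])
```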


— FREAK: the FREAK descriptor is a binary descriptor based on brightness comparison tests between many
sampling points around a keypoint. FREAK samples receptive field centres and forms pairs of points
P_a = (P_i, P_j), where i, j ∈ {1, 2, ..., N} and i ≠ j. Each intensity comparison s(P_a) is binary encoded:

\[
s(P_a) =
\begin{cases}
1, & \text{if } I(P_i) > I(P_j) \\
0, & \text{otherwise}
\end{cases}
\]

where I(·) denotes the intensity at a sampling point. To build the descriptor, FREAK uses the following equation:

\[
F = \sum_{0 \le a < N} 2^{a} \, s(P_a)
\]


Bulletin of Electr Eng & Inf, Vol. 13, No. 2, April 2024: 1172-1182, ISSN: 2302-9285

— BEBLID: it is a boosting-based binary descriptor. It computes descriptors for the keypoints of any
detector using the scale_factor argument and the differences of mean grey values in various regions of
the image around the detector keypoints. The BEBLID descriptor is used for image matching and retrieval
problems. scale_factor adjusts the sampling window around the detected keypoints: i) 1.00f scale for ORB
keypoints; ii) 6.75f scale for SIFT-detected keypoints; iii) 6.25f default scale for KAZE- and SURF-detected
keypoints; and iv) 5.00f scale for BRISK and FAST keypoints. A discriminative and fast-to-compute feature
extraction function f(x) is the key to BEBLID's efficiency:

\[
f(x; p_1, p_2, s) = \frac{1}{s^2} \left( \sum_{q \in R(p_1, s)} I(q) - \sum_{r \in R(p_2, s)} I(r) \right)
\]

where R(p, s) is a square box of size s centred at pixel p, and I(t) is the grey value at pixel t. f calculates
the difference between the mean grey values in R(p_1, s) and R(p_2, s).
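The box-difference function f can be sketched directly from its definition. This is a minimal sketch that assumes an odd box size and boxes that lie fully inside the image; the names `box_mean` and `beblid_feature` are hypothetical, and the boosting stage that selects and thresholds such features is not shown.

```python
def box_mean(image, center, size):
    """Mean grey value of the size x size box centred at center = (row, col)."""
    half = size // 2
    r0, c0 = center[0] - half, center[1] - half
    total = sum(image[r][c]
                for r in range(r0, r0 + size)
                for c in range(c0, c0 + size))
    return total / (size * size)


def beblid_feature(image, p1, p2, size):
    """f(x; p1, p2, s): difference of the mean grey values of two boxes."""
    return box_mean(image, p1, size) - box_mean(image, p2, size)


# Hypothetical 3x6 image: a dark left half (10) and a brighter right half (40).
image = [[10, 10, 10, 40, 40, 40] for _ in range(3)]
feature = beblid_feature(image, (1, 1), (1, 4), 3)
```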

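— LATCH: this binary descriptor evaluates the similarities and differences of image patch triplets. LATCH
compares the pixel intensities of three patches to create a single bit in the binary string that encodes the
region. At the expense of a small increase in runtime, it outperforms existing alternatives by a large margin.
The LATCH descriptor formula is:

\[
g(W, s_t) =
\begin{cases}
1, & \text{if } \lVert P_{t,a} - P_{t,1} \rVert_F^2 > \lVert P_{t,a} - P_{t,2} \rVert_F^2 \\
0, & \text{otherwise}
\end{cases}
\]

where P_{t,a} is the anchor patch of triplet s_t in window W, P_{t,1} and P_{t,2} are its two companion patches,
and \lVert \cdot \rVert_F is the Frobenius norm.
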
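A single LATCH bit can be sketched as a comparison of two squared Frobenius distances. This is an illustrative sketch only: patches are small lists of lists, the names `frobenius_sq` and `latch_bit` are hypothetical, and the learned arrangement of triplets is not modelled.

```python
def frobenius_sq(a, b):
    """Squared Frobenius distance between two equally sized patches."""
    return sum((x - y) ** 2
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def latch_bit(anchor, patch1, patch2):
    """1 if the anchor is farther from patch1 than from patch2, else 0."""
    return 1 if frobenius_sq(anchor, patch1) > frobenius_sq(anchor, patch2) else 0


# Hypothetical 2x2 patches: patch2 is much closer to the anchor than patch1.
anchor = [[1, 1], [1, 1]]
patch1 = [[5, 5], [5, 5]]
patch2 = [[1, 2], [1, 1]]
bit = latch_bit(anchor, patch1, patch2)
```
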
2.2.3. Homography matrix 
A homography is a perspective transformation between two planes. It transforms points in one image into the
corresponding points in another image. srcPoints is a matrix (vector) of the coordinates of the points in the
original plane, and dstPoints is a matrix (vector) of the coordinates of the points in the target plane. In this
implementation, RANSAC is used to compute the homography matrix with the following function:


cv2.findHomography(srcPoints, dstPoints[, method[, ransacReprojThreshold[, mask[, maxIters[, 
confidence]]]]]) 


RANSAC: the homography of each image pair is calculated using a random sample consensus approach. Each
RANSAC iteration randomly selects four initial feature matches. The steps of RANSAC are: i) select
four feature pairs at random; ii) compute the exact homography H; iii) compute the inliers; iv) keep the largest
set of inliers; and v) re-compute the least-squares estimate of H on all of the inliers. To assess the homography
matrix of the images, we use this formula:

\[
\varepsilon(i) = \max_{1 \le j \le 4} \lVert H_i P_j - p_{ij} \rVert
\]

where H_i is the predicted homography at frame i, P_j is the reference frame position of target corner j, and
p_{ij} is the manually acquired location of corner j in frame i.
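How a homography maps a point, and how the maximum corner error is evaluated, can be sketched without OpenCV. This is a minimal sketch under stated assumptions: H is a 3x3 list of lists, points are (x, y) tuples, and the function names and the sample translation homography are hypothetical.

```python
import math


def project(H, point):
    """Apply a 3x3 homography to an (x, y) point using homogeneous coordinates."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)


def max_corner_error(H, reference_corners, observed_corners):
    """Largest distance between projected reference corners and observed corners."""
    errors = []
    for ref, obs in zip(reference_corners, observed_corners):
        px, py = project(H, ref)
        errors.append(math.hypot(px - obs[0], py - obs[1]))
    return max(errors)


# Hypothetical homography: a pure translation by (2, 3).
H = [[1, 0, 2], [0, 1, 3], [0, 0, 1]]
corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
observed = [(2, 3), (3, 3), (3, 4), (2, 5)]  # last corner is off by one pixel
error = max_corner_error(H, corners, observed)
```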


2.2.4. Matching features 

To obtain the matching features, we use matching techniques. To match the descriptors of
the input and query images, we use the descriptor matcher provided by OpenCV with the brute-force Hamming
approach. Many matching algorithms are available, but we use KNN matching to match each descriptor with its
nearest descriptors in the other image, because it returns the closest matches for each descriptor.
To get the matches, we give the descriptors as input
to the matcher. In Algorithm 1, match-1 and match-2 are the matching points of the input and query
images. In our proposed system, we implemented KNN matching to find the matches for the input images.


Algorithm 1: KNN-matching 

Step 1: Take the descriptors of the input image and the query image.

Step 2: Set the matching ratio (0.8) for testing the distances between the image descriptors.

Step 3: Find the Euclidean distance between every input and query image descriptor by the formula

\[
d_{we}(i, j) = \left( \sum_{k=1}^{p} w_k \, (x_{ik} - x_{jk})^2 \right)^{1/2}
\]

        where x_{ik} and x_{jk} are the k-th components of descriptors i and j, p is the descriptor length, and
        w_k is the weight of component k (w_k = 1 gives the ordinary Euclidean distance).

Step 4: After finding the distances, check the condition that the distance of the best match m is less than
        the matching ratio times the distance of the second-best match n, i.e., m.distance <
        0.8 * n.distance.

Step 5: If the condition is satisfied, append the keypoint at the query index of keypoint1 to match-1 and
        the keypoint at the train index of keypoint2 to match-2.

Step 6: Match-1 and match-2 are the matching points.

Step 7: END the process. 
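The ratio test in Algorithm 1 can be sketched in plain Python. This sketch makes two assumptions that should be read as such: descriptors are represented as integers, and the distance is the Hamming distance of the brute-force Hamming matcher named in Section 2.2.4 rather than the Euclidean distance of Step 3. The function names and the sample descriptors are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def knn_ratio_match(query_desc, train_desc, ratio=0.8):
    """Keep a match only if the best distance is below ratio * second-best distance.

    Returns a list of (query_index, train_index, distance) tuples.
    """
    matches = []
    for qi, q in enumerate(query_desc):
        ranked = sorted((hamming(q, t), ti) for ti, t in enumerate(train_desc))
        if len(ranked) < 2:
            continue  # the ratio test needs two neighbours
        (best_dist, best_idx), (second_dist, _) = ranked[0], ranked[1]
        if best_dist < ratio * second_dist:
            matches.append((qi, best_idx, best_dist))
    return matches


# Hypothetical 8-bit descriptors; the third query is ambiguous and is rejected.
query = [0b11110000, 0b00001111, 0b00000011]
train = [0b11110001, 0b00000000, 0b00001111]
result = knn_ratio_match(query, train)
```

The third query descriptor is equally close to two train descriptors, so the ratio test discards it; this is the filtering that Step 4 describes.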
2.2.5. Inlier calculation

Inliers are the pure matches between the input and query images. To calculate them, the
system needs the homography matrix and the matched keypoints of the images. For the inlier calculation, we
use the homography check and the Euclidean distance of the matched keypoints. With Algorithm 2,
we find the inliers and the good matches of the images in our proposed work.


Algorithm 2: Inlier calculations 
Step 1: Set an inlier threshold to identify inliers with a homography check.

Step 2: Create a homogeneous point for each matched point.

Step 3: Project the homogeneous points from the input image to the query image using the homography matrix.

Step 4: Calculate the Euclidean distance between each projected point and its matched keypoint in the query
        image.

Step 5: Check the condition that the Euclidean distance is less than the inlier threshold value.

Step 6: If the condition in Step 5 is satisfied, create an OpenCV DMatch object from the current lengths of
        inliers1 and inliers2 and append it to the good matches. Then append the input match point to
        inliers1 and the query match point to inliers2.

Step 7: END the process. 
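The steps of Algorithm 2 can be sketched as follows. This is an illustrative sketch, not the authors' code: points are plain (x, y) tuples instead of OpenCV keypoints, an index pair stands in for the DMatch object, and the threshold of 2.5 pixels, the function name, and the sample data are assumptions.

```python
import math


def find_inliers(H, matched1, matched2, inlier_threshold=2.5):
    """Homography check: keep matches whose projection lands near its partner."""
    inliers1, inliers2, good_matches = [], [], []
    for (x1, y1), (x2, y2) in zip(matched1, matched2):
        # Steps 2-3: homogeneous point (x1, y1, 1) projected with H.
        xh = H[0][0] * x1 + H[0][1] * y1 + H[0][2]
        yh = H[1][0] * x1 + H[1][1] * y1 + H[1][2]
        w = H[2][0] * x1 + H[2][1] * y1 + H[2][2]
        px, py = xh / w, yh / w
        # Steps 4-5: Euclidean distance against the inlier threshold.
        if math.hypot(px - x2, py - y2) < inlier_threshold:
            # Step 6: index pair plays the role of DMatch(len(inliers1), len(inliers2), 0).
            good_matches.append((len(inliers1), len(inliers2)))
            inliers1.append((x1, y1))
            inliers2.append((x2, y2))
    return inliers1, inliers2, good_matches


# Hypothetical data: translation by (2, 3); the third match is an outlier.
H = [[1, 0, 2], [0, 1, 3], [0, 0, 1]]
matched1 = [(0, 0), (10, 10), (5, 5)]
matched2 = [(2, 3), (12.5, 13.0), (50, 50)]
inliers1, inliers2, good_matches = find_inliers(H, matched1, matched2)
```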


2.2.6. Draw the matches for pure inliers 

In our proposed system, we draw the pure inliers to represent the pure matches of the images using
OpenCV, which provides the pre-defined function cv2.drawMatches for drawing the matches of two images.
The function is called as follows:


cv2.drawMatches(image-1, inliers1, image-2, inliers2, good_matches)

Model:

Image-1: Input image of the system.

Image-2: Query image of the system.

Inliers-1: Input image inliers.

Inliers-2: Query image inliers.

Good_matches: Similar and pure matches of input and query image.

Code:

import cv2
import numpy as np

# Output canvas large enough to hold both images side by side
res = np.empty((max(input_image.shape[0], QueryImage.shape[0]),
                input_image.shape[1] + QueryImage.shape[1], 3), dtype=np.uint8)
cv2.drawMatches(input_image, Inliers_InputImage, QueryImage, Inliers_QueryImage,
                good_matches, res)


2.2.7. Inlier ratio calculation 

The matching produces qualitative outcomes for the image, so it is hard to determine whether a
detector gives the best result. To overcome this, we compute the inlier ratio as a quantitative measure of the
outcome. The formula for the inlier ratio is:

\[
\text{Percentage of Inliers} = 100 \times \frac{\text{Inliers}}{\text{Matches}}
\]
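As a worked example, the formula can be applied to the counts reported in Figure 8 (44 matches, 29 inliers). The function name and the guard for zero matches are illustrative choices, not part of the original method.

```python
def inlier_ratio(num_inliers, num_matches):
    """Percentage of matches that survive the homography check."""
    if num_matches == 0:
        return 0.0  # no matches: define the ratio as zero instead of dividing by zero
    return 100.0 * num_inliers / num_matches


# Counts taken from Figure 8: 29 inliers out of 44 matches, about 65.9 percent.
ratio = inlier_ratio(29, 44)
```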


3. IMPLEMENTATION 

The proposed system was implemented using feature descriptors such as ORB, LATCH,
BEBLID, FREAK, and BRISK. The KNN matching algorithm and the inlier calculations
are performed on the descriptors they produce. The following steps are used to implement the proposed system.


Implementation of proposed work


Step 1: The feature descriptors produce the descriptors of the image as an n-dimensional array,
        as shown in Figures 4 and 5.

Step 2: Calculate the homography matrix with the feature detectors. As illustrated in Figure 6, we
        identify the homography matrix to compute the inliers.

Step 3: Find the matching features of the input and query images, as shown in Figure 7.

Step 4: Calculate the inliers of the matched keypoints with the inlier calculation algorithm. In the inlier
        calculation, we also collect the good matches of the images.

Step 5: Draw the inliers on the images to represent the pure matches. We use OpenCV to draw them.

Step 6: Calculate the inlier ratio from the matches and inliers of the images using the inlier ratio
        formula, as shown in Figure 8.

[[ 15 176 251 ... 221 244 118]
 [ 61 232 123 ... 155 196 226]
 [236 125  57 ...  91 198  35]
 [ 47 252 225 ... 222 210 188]
 [235 222 168 ... 220 244 127]
 [131  58 179 ... 213  39 253]]


Figure 4. Input image feature descriptors


[[ 58 132 110 ...  10 240  34]
 [ 98  16 239 ...   8  55 251]
 [123 123  64 ... 143  11 213]
 [ 61 249  23 ...  51   6 240]
 [116 249  19 ...  43  43 128]
 [ 56 239  80 ... 158 142 139]]


Figure 5. Query image feature descriptors


[[ 1.01117223e+00  5.79232024e-02 -7.77440117e+00]
 [ 1.60056844e-02  1.62378309e+00 -2.80059742e+00]
 [ 1.75088436e-04  2.57940521e-04  1.00000000e+00]]


Figure 6. The homography matrix for the descriptors


Figure 7. Pure matchings of the input and query images


Matching Results
# Keypoints 1: 2145
# Keypoints 2: 1839
# Matches: 44
# Inliers: 29
# Inliers Ratio: 65.9090909090909


Figure 8. Result of the input and query images 


4. RESULTS 

The proposed system was tested on different types of images. Our system extracts the image features
with the help of feature detectors, evaluates them using the matching features, and
produces the inlier ratio of the image. The input and query images are shown in Figure 9. Scenarios 1-3 are the
images considered to obtain the inlier ratio as output. Figure 9 consists of three attributes: description, input
and query images, and output; it shows the results of the evaluated images. In Figures 10 to 12, the inlier values
are used to construct graphs for scenarios 1-3.


Figure 10. Inlier ratios for the scenario-1


Figure 11. Inlier ratios for the scenario-2


Figure 12. Inlier ratios for the scenario-3


5. CONCLUSION

We used real-world scenarios to evaluate image features with various descriptors and to find accurate
matches between input and query images. We measured the performance of the descriptors on similar
images taken from different angles and under different weather conditions. Our proposed system shows which
descriptor performs best in terms of measures such as the inlier ratio and the number of matches. Working in the
real world gives complete control over the surroundings and awareness of the scene's geometry, which makes it
possible to explore the effects of several environmental factors on the descriptors in isolation. According to
our results, the efficiency of gradient descriptors for matching keypoints in images recorded under different
illuminations depends on the spatial structure of the pooling areas. However, issues with the number of pooling
areas remain to be resolved. Images were collected from several camera viewpoints. Due to a lack of
distinctiveness, the lower-dimensional feature descriptors performed poorly. We ranked the inlier ratios of the
specific image descriptors.